PSU: Particle Stacking Undersampling Method for Highly Imbalanced Big Data
Similar resources
ClusterOSS: a new undersampling method for imbalanced learning
A dataset is said to be imbalanced when its classes are disproportionately represented in terms of the number of instances they contain. This problem is common in applications such as medical diagnosis of rare diseases, detection of fraudulent calls, and signature recognition. In this paper we propose an alternative method for imbalanced learning, which balances the dataset using an undersampling s...
Addressing data complexity for imbalanced data sets: analysis of SMOTE-based oversampling and evolutionary undersampling
In the classification framework there are problems in which the number of examples per class is not equitably distributed, formerly known as imbalanced data sets. This situation is a handicap when trying to identify the minority classes, as the learning algorithms are not usually adapted to such characteristics. A usual approach to deal with the problem of imbalanced data sets is the use of a ...
Predictive Data Mining for Highly Imbalanced Classification
The paper addresses some theoretical and practical aspects of data mining, focusing on predictive data mining, where two central types of prediction problems are discussed: classification and regression. Further accent is made on predictive data mining, where the time-stamped data greatly increase the dimensions and complexity of problem solving. The main goal is through processing of data (rec...
Evolutionary Undersampling for Classification with Imbalanced Datasets: Proposals and Taxonomy
Learning with imbalanced data is one of the recent challenges in machine learning. Various solutions have been proposed in order to find a treatment for this problem, such as modifying methods or the application of a preprocessing stage. Within the preprocessing focused on balancing data, two tendencies exist: reduce the set of examples (undersampling) or replicate minority class examples (over...
A Novel Approach for Handling Imbalanced Data in Medical Diagnosis using Undersampling Technique
In many data mining applications the imbalanced learning problem is becoming ubiquitous. Data sets with an unequal distribution of samples among classes are known as imbalanced data sets. When such highly imbalanced data sets are given to a classifier, the classifier may misclassify the rare samples from the minority class. To deal with such type of im...
Journal
Journal title: IEEE Access
Year: 2020
ISSN: 2169-3536
DOI: 10.1109/access.2020.3009753